Fix: /chat/completions logs status 500 even when the wire status is 404/503#2
Draft
mimeding wants to merge 2 commits into
Draft
Fix: /chat/completions logs status 500 even when the wire status is 404/503#2mimeding wants to merge 2 commits into
mimeding wants to merge 2 commits into
Conversation
Non-streaming /chat/completions correctly maps ChatEngine.EngineError to its intended HTTP status (404 for unknown model, 503 for unavailable provider, etc., per PR osaurus-ai#863) when writing the response, but the Insights log entry always recorded responseStatus: 500. That makes per-status metrics, the WorkView error classifier, and any log-driven debugging report blanket 5xx server faults even when the real failure was a client-side 4xx (e.g. wrong model id). Use the same status.code we already wrote to the wire so the log entry matches the HTTP response. Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>
This was referenced May 27, 2026
Draft
ModelManager.init kicks off an unstructured Task that calls loadOsaurusAIOrgModels(), which fetches the OsaurusAI organization listing from Hugging Face and feeds the result through applyOsaurusOrgFetch. The unit-test runner repeatedly constructs ModelManager() to drive applyOsaurusOrgFetch directly. The background launch-time fetch races with those test calls — whichever finishes last wins, and the merge result is non-deterministic. That's the root cause of the flaky ModelManagerSuggestedTests failures seen across many of the recent PR CI runs (applyOsaurusOrgFetch_dropsStaleAutoFetched OnReapply, applyOsaurusOrgFetch_addsNewEntriesAfterCurated, etc.). Gate the launch-time fetch on a small isRunningInTestEnvironment helper that checks for any of XCTestConfigurationFilePath, XCTestBundlePath, or XCTestSessionIdentifier in the process environment. Those variables are only present inside an xctest host process; production app launches still get the HF fetch exactly as before. This is a network call, so removing it under tests also has the side benefit of making the test suite work offline / on hermetic CI runners. Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Why this matters (business)
The Insights tab and any future log-driven dashboards are how operators tell "is my server healthy?" apart from "did a client send me a bad request?" Today, every non-streaming
/chat/completionserror is logged as 500 Internal Server Error, even when the user just asked for a model that isn't installed (404) or a remote provider is down (503). That mis-classification makes the metrics look like the server is on fire when it's actually a clean client- or config-level failure — exactly the noise PR osaurus-ai#863 set out to remove.Same effect on the
WorkViewerror classifier: it consumes Insights entries to give users actionable feedback ("install the model" vs "check your network"), and that classifier was already wired up correctly for the wire response — it just never saw the right status from the log.What's wrong (technical)
The non-streaming handler maps
ChatEngine.EngineErrorto its intendedhttpStatusand writes that to the wire:…but the Insights log call right below hardcoded
responseStatus: 500:Fix
Use
Int(status.code)so the log entry matches the wire status. One-line change, zero behaviour impact on actual responses; only Insights observability improves.Out of scope (intentional): the streaming
/chat/completions,/messages, and/responseserror paths log 500. Those handlers commit the SSE 200 head before the failure happens or always reply 500 on the wire, so their log status already matches the wire reality. They can be revisited as a follow-up if/when those handlers gainEngineError-aware status mapping.Changes
InsightsServiceinto HTTP handler tests, which is larger than the fix)Test Plan
curl localhost:1234/v1/chat/completions -d '{"model":"nonexistent","messages":[{"role":"user","content":"hi"}]}'.404(already correct).Checklist
CONTRIBUTING.md